Nested Partially-Latent Class Models for Dependent Binary Data; Estimating Disease Etiology

Authors: Deloria-Knoll Maria; Wu Zhenke; Zeger Scott
Publication date: 29/10/2015

The Pneumonia Etiology Research for Child Health (PERCH) study seeks to use modern measurement technology to infer the causes of pneumonia for which gold-standard evidence is unavailable. The paper describes a latent variable model designed to infer from case-control data the etiology distribution for the population of cases, and for an individual case given his or her measurements. We assume each observation is drawn from a mixture model for which each component represents one cause or disease class. The model addresses a major limitation of the traditional latent class approach by taking account of residual dependence among multivariate binary outcomes given disease class; it thereby reduces estimation bias, retains efficiency, and offers more valid inference. Such "local dependence" on a single subject is induced in the model by nesting latent subclasses within each disease class. Measurement precision and covariation can be estimated using the control sample, for whom the class is known. In a Bayesian framework, we use stick-breaking priors on the subclass indicators for model-averaged inference across different numbers of subclasses. Assessment of model fit and individual diagnosis are done using posterior samples drawn by Gibbs sampling. We demonstrate the utility of the method on simulated and on the motivating PERCH data. Comment: 30 pages with 5 figures and 1 table; 1 appendix with 4 figures and 1 table
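The stick-breaking prior mentioned in the abstract can be illustrated with a minimal sketch. It assumes the standard truncated construction (each break drawn from Beta(1, alpha)); the function name, the truncation level, and the value of `alpha` are illustrative choices, not taken from the paper.

```python
import random

def stick_breaking_weights(alpha, n_subclasses, rng):
    """Draw truncated stick-breaking weights that sum to 1.

    Assumption: breaks v_k ~ Beta(1, alpha), the usual construction;
    the paper's exact prior settings are not given in the abstract.
    """
    weights = []
    remaining = 1.0  # length of the stick not yet assigned
    for k in range(n_subclasses):
        # The last subclass takes whatever is left so the weights sum to 1.
        v = 1.0 if k == n_subclasses - 1 else rng.betavariate(1.0, alpha)
        weights.append(remaining * v)
        remaining *= 1.0 - v
    return weights

rng = random.Random(0)
subclass_weights = stick_breaking_weights(alpha=1.0, n_subclasses=5, rng=rng)
```

Smaller values of `alpha` tend to put most of the mass on the first few subclasses, which is how a prior of this kind can average over different effective numbers of subclasses.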

arXiv.org e-Print Archive

Collection Of Biostatistics Research Archive

ON THE EQUIVALENCE OF CASE-CROSSOVER AND TIME SERIES METHODS IN ENVIRONMENTAL EPIDEMIOLOGY

Authors: Lu Yun; Zeger Scott L.
Publication venue: Collection of Biostatistics Research Archive
Publication date: 13/03/2006

Time series and case-crossover methods are often viewed as competing alternatives in environmental epidemiologic studies. Several recent studies have compared the time series and case-crossover methods. In this paper, we show that case-crossover using conditional logistic regression is a special case of time series analysis when there is a common exposure such as in air pollution studies. This equivalence provides computational convenience for case-crossover analyses and a better understanding of time series models. Time series log-linear regression accounts for over-dispersion of the Poisson variance, while case-crossover analyses typically do not. This equivalence also permits model checking for case-crossover data using standard log-linear model diagnostics.
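The equivalence described above can be checked numerically in a simplified setting. The sketch below assumes a time-stratified design with one shared exposure series per stratum, daily event counts, and no over-dispersion; the data and function names are made up for illustration. It compares the conditional logistic log-likelihood with a Poisson log-linear log-likelihood in which each stratum's intercept is profiled out. The two should differ only by a constant that does not depend on the exposure coefficient, so they yield the same estimate.

```python
import math

def conditional_loglik(beta, strata):
    """Case-crossover (conditional logistic) log-likelihood, shared exposure."""
    total = 0.0
    for x, y in strata:
        log_denom = math.log(sum(math.exp(beta * xt) for xt in x))
        total += sum(yt * (beta * xt - log_denom) for xt, yt in zip(x, y))
    return total

def poisson_profile_loglik(beta, strata):
    """Poisson log-linear log-likelihood with one intercept per stratum,
    each intercept replaced by its maximum-likelihood value given beta."""
    total = 0.0
    for x, y in strata:
        n_events = sum(y)
        s = sum(math.exp(beta * xt) for xt in x)
        alpha = math.log(n_events / s)  # MLE of the stratum intercept
        for xt, yt in zip(x, y):
            eta = alpha + beta * xt
            total += yt * eta - math.exp(eta) - math.lgamma(yt + 1)
    return total

# Hypothetical data: (exposure per day, event count per day) for two strata.
strata = [
    ([10.0, 12.0, 9.0, 15.0], [2, 3, 1, 4]),
    ([20.0, 18.0, 25.0, 22.0], [1, 0, 3, 2]),
]
gaps = [
    poisson_profile_loglik(b, strata) - conditional_loglik(b, strata)
    for b in (-0.1, 0.0, 0.05, 0.2)
]
```

The entries of `gaps` agree up to floating-point error, which is the sense in which the two likelihoods are equivalent for estimating the exposure effect in this toy setup.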

Partially-Latent Class Models (pLCM) for Case-Control Studies of Childhood Pneumonia Etiology

Authors: Deloria-Knoll Maria; Hammitt Laura L.; Wu Zhenke; Zeger Scott L.
Publication date: 31/05/2014

In population studies on the etiology of disease, one goal is the estimation of the fraction of cases attributable to each of several causes. For example, pneumonia is a clinical diagnosis of lung infection that may be caused by viral, bacterial, fungal, or other pathogens. The study of pneumonia etiology is challenging because directly sampling from the lung to identify the etiologic pathogen is not standard clinical practice in most settings. Instead, measurements from multiple peripheral specimens are made. This paper introduces the statistical methodology designed for estimating the population etiology distribution and the individual etiology probabilities in the Pneumonia Etiology Research for Child Health (PERCH) study of 9,500 children from 7 sites around the world. We formulate the scientific problem in statistical terms as estimating the mixing weights and latent class indicators under a partially-latent class model (pLCM) that combines heterogeneous measurements with different error rates obtained from a case-control study. We introduce the pLCM as an extension of the latent class model. We also introduce graphical displays of the population data and inferred latent-class frequencies. The methods are tested with simulated data, and then applied to PERCH data. The paper closes with a brief description of extensions of the pLCM to the regression setting and to the case where conditional independence among the measures is relaxed. Comment: 25 pages, 4 figures, 1 supplementary material
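As a rough illustration of "individual etiology probabilities" under conditional independence, the sketch below applies Bayes' rule to one case. It assumes one binary test per candidate pathogen, with a true positive rate when that pathogen is the cause and a false positive rate otherwise. The function name and all numbers are hypothetical, and the rates are treated as known, whereas the paper estimates them.

```python
def etiology_posterior(measurements, etiology_fractions, tpr, fpr):
    """Posterior probability of each cause for one case.

    Assumptions (not from the abstract): test j targets pathogen j, is
    positive with probability tpr[j] if pathogen j is the cause and
    fpr[j] otherwise, and tests are independent given the cause.
    """
    unnormalized = []
    for k, pi_k in enumerate(etiology_fractions):
        likelihood = 1.0
        for j, m in enumerate(measurements):
            p_positive = tpr[j] if j == k else fpr[j]
            likelihood *= p_positive if m == 1 else 1.0 - p_positive
        unnormalized.append(pi_k * likelihood)
    total = sum(unnormalized)
    return [u / total for u in unnormalized]

# Hypothetical case: positive for pathogen 0, negative for pathogens 1 and 2.
posterior = etiology_posterior(
    measurements=[1, 0, 0],
    etiology_fractions=[0.5, 0.3, 0.2],
    tpr=[0.8, 0.7, 0.9],
    fpr=[0.1, 0.2, 0.05],
)
```

In a case-control design such as the one described, the control sample is what would inform the false positive rates, since controls are known not to have the disease.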

arXiv.org e-Print Archive

Collection Of Biostatistics Research Archive

DECOMPOSITION OF REGRESSION ESTIMATORS TO EXPLORE THE INFLUENCE OF UNMEASURED TIME-VARYING CONFOUNDERS

Authors: Lu Yun; Zeger Scott L.
Publication venue: Collection of Biostatistics Research Archive
Publication date: 21/11/2007

In environmental epidemiology, exposure X and health outcome Y vary in space and time. We present a method to diagnose the possible influence of unmeasured confounders U on the estimated effect of X on Y and propose several approaches to robust estimation. The idea is to use space and time as proxy measures for the unmeasured factors U. We start with the time series case where X and Y are continuous variables at equally-spaced times and assume a linear model. We define matching estimators b(u) that correspond to pairs of observations separated by a specific lag u. Controlling for a smooth function of time, S_t, using a kernel estimator is roughly equivalent to estimating the association with a linear combination of the b(u) with weights that involve two components: the assumptions about the smoothness of S_t and the normalized variogram of the X process. When an unmeasured confounder U exists, but the model otherwise correctly controls for measured confounders, the excess variation in the b(u) is evidence of confounding by U. We use a plot of b(u) versus lag u, the lagged-estimator plot (LEP), to diagnose the influence of U on the effect of X on Y. We use an appropriate linear combination of the b(u), or extrapolate to b(0), to obtain novel estimators that are more robust to the influence of smooth U. The methods are extended to time series log-linear models and to spatial analyses. The LEP gives a direct view of the magnitude of the estimators for each lag u and provides evidence when models do not adequately describe the data.
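A minimal sketch of a lag-specific matching estimator follows. It assumes b(u) is the least-squares slope through the origin of lag-u differences in Y on lag-u differences in X; this is one plausible reading of the abstract, not a formula quoted from the paper, and the series are invented.

```python
def lagged_estimator(x, y, lag):
    """Slope of (y[t] - y[t-lag]) on (x[t] - x[t-lag]), through the origin.

    Assumption: this differencing form stands in for the paper's b(u).
    """
    numerator = 0.0
    denominator = 0.0
    for t in range(lag, len(x)):
        dx = x[t] - x[t - lag]
        dy = y[t] - y[t - lag]
        numerator += dx * dy
        denominator += dx * dx
    return numerator / denominator

# Hypothetical series with no confounding: y depends on x alone.
x = [1.0, 3.0, 2.0, 5.0, 4.0, 6.0, 8.0, 7.0]
y = [2.0 * xt + 1.0 for xt in x]
lep_points = [(u, lagged_estimator(x, y, u)) for u in (1, 2, 3)]
```

Here every b(u) equals the true slope, so the points in `lep_points` form a flat line. A b(u) that drifts with u would be the kind of excess variation the abstract treats as evidence of confounding.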

Frequency Domain Bootstrap Methods For Time Series

Authors: Hurvich Clifford M.; Zeger Scott
Publication venue: New York University Graduate School of Business Administration
Publication date: 01/02/1987

Two frequency domain bootstrap methods for weakly stationary time series will be proposed. The motivations for the proposed methods will be discussed, and the performance of the first method will be compared with that of a recently proposed method of Swanepoel and van Wyk, in a Monte Carlo study. It is found that, when applied to the problem of estimating the variance of a log spectrum estimate, all methods under consideration can sometimes perform poorly. Overall, the frequency domain method used in conjunction with the automatic spectrum estimate choice criterion developed by Hurvich is found to perform best.

TRENDS IN PARTICULATE MATTER AND MORTALITY: AN APPROACH TO THE ASSESSMENT OF UNMEASURED CONFOUNDING

Authors: Dominici Francesca; Janes Holly; Zeger Scott
Publication venue: Collection of Biostatistics Research Archive
Publication date: 30/03/2007

We propose a method for diagnosing confounding bias under a model which links a spatially and temporally varying exposure and health outcome. We decompose the association into orthogonal components, corresponding to distinct spatial and temporal scales of variation. If the model fully controls for confounding, the exposure effect estimates should be equal at the different temporal and spatial scales. We show that the overall exposure effect estimate is a weighted average of the scale-specific exposure effect estimates. We use this approach to estimate the association between monthly averages of fine particles (PM2.5) over the preceding 12 months and monthly mortality rates in 113 U.S. counties from 2000-2002. We decompose the association between PM2.5 and mortality into two components: 1) the association between “national trends” in PM2.5 and mortality; and 2) the association between “local trends,” defined as county-specific deviations from national trends. This second component provides evidence as to whether counties having steeper declines in PM2.5 also have steeper declines in mortality relative to their national trends. We find that the exposure effect estimates are different at these two spatio-temporal scales, which raises concerns about confounding bias. We believe that the association between trends in PM2.5 and mortality at the national scale is more likely to be confounded than is the association between trends in PM2.5 and mortality at the local scale. If the association at the national scale is set aside, there is little evidence of an association between 12-month exposure to PM2.5 and mortality.
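The claim that the overall estimate is a weighted average of scale-specific estimates can be seen in a simplified linear setting. The sketch below uses ordinary least squares rather than the log-linear mortality models of the paper. It splits a centered exposure into a "national" component (the average across counties at each time) and a "local" component (the county's deviation from that average); all data are invented.

```python
def slope(x, y):
    """Least-squares slope of y on a zero-mean regressor x."""
    return sum(a * b for a, b in zip(x, y)) / sum(a * a for a in x)

# Hypothetical exposure[county][time] and outcome[county][time].
exposure = [[12.0, 11.0, 10.0], [15.0, 13.5, 12.0], [9.0, 9.5, 8.0]]
outcome = [[8.1, 7.9, 7.6], [9.0, 8.4, 8.1], [7.2, 7.3, 6.9]]
n_counties, n_times = len(exposure), len(exposure[0])

grand_mean = sum(sum(row) for row in exposure) / (n_counties * n_times)
time_means = [sum(exposure[c][t] for c in range(n_counties)) / n_counties
              for t in range(n_times)]

x_national, x_local, x_total, y = [], [], [], []
for c in range(n_counties):
    for t in range(n_times):
        x_national.append(time_means[t] - grand_mean)   # national trend
        x_local.append(exposure[c][t] - time_means[t])  # county deviation
        x_total.append(exposure[c][t] - grand_mean)     # sum of the two
        y.append(outcome[c][t])

b_overall = slope(x_total, y)
b_national = slope(x_national, y)
b_local = slope(x_local, y)

# Weights are each component's share of the exposure variation.
ss_national = sum(v * v for v in x_national)
ss_local = sum(v * v for v in x_local)
w_national = ss_national / (ss_national + ss_local)
b_weighted = w_national * b_national + (1.0 - w_national) * b_local
```

Because the two components are orthogonal, `b_overall` and `b_weighted` coincide. A large gap between `b_national` and `b_local` is the kind of discrepancy the abstract interprets as a warning sign for confounding.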

ON MARGINALIZED MULTILEVEL MODELS AND THEIR COMPUTATION

Authors: Griswold Michael E.; Zeger Scott L.
Publication venue: Collection of Biostatistics Research Archive
Publication date: 15/11/2004

Clustered data analysis is characterized by the need to describe both systematic variation in a mean model and cluster-dependent random variation in an association model. Marginalized multilevel models embrace the robustness and interpretations of a marginal mean model, while retaining the likelihood inference capabilities and flexible dependence structures of a conditional association model. Although there has been increasing recognition of the attractiveness of marginalized multilevel models, there has been a gap in their practical application arising from a lack of readily available estimation procedures. We extend the marginalized multilevel model to allow for nonlinear functions in both the mean and association aspects. We then formulate marginal models through conditional specifications to facilitate estimation with mixed model computational solutions already in place. We illustrate this approach on a cerebrovascular deficiency crossover trial.
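The link between a marginal mean and a conditional specification can be sketched for one simple case. The example assumes a logistic link with a normal random intercept, which the abstract does not specify. It solves for the conditional intercept `delta` such that averaging the conditional probability over the random effect reproduces a target marginal probability. The function names, integration grid, and bisection bounds are illustrative choices.

```python
import math

def expit(z):
    return 1.0 / (1.0 + math.exp(-z))

def marginal_mean(delta, sigma, n_grid=2001):
    """E[expit(delta + sigma * Z)] for standard normal Z, by the
    trapezoid rule on [-8, 8] (tails beyond that are negligible)."""
    lo, hi = -8.0, 8.0
    step = (hi - lo) / (n_grid - 1)
    total = 0.0
    for i in range(n_grid):
        z = lo + i * step
        weight = 0.5 if i in (0, n_grid - 1) else 1.0
        density = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)
        total += weight * expit(delta + sigma * z) * density
    return total * step

def solve_delta(target_marginal_mean, sigma):
    """Bisection for the conditional intercept matching the marginal mean."""
    lo, hi = -30.0, 30.0
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        if marginal_mean(mid, sigma) < target_marginal_mean:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# Hypothetical target: marginal probability 0.3, random-intercept SD 2.
delta = solve_delta(0.3, 2.0)
recovered = marginal_mean(delta, 2.0)
```

The solved `delta` is further from zero than logit(0.3), reflecting the familiar attenuation of marginal effects relative to conditional ones in logistic random-intercept models.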
